Abstract 

Background: Allergic rhinitis is a common disease whose genetic basis is incompletely explained. We report an 
integrated genomic analysis of allergic rhinitis. 

Methods: We performed genome-wide association studies (GWAS) of allergic rhinitis in 5633 ethnically diverse North 
American subjects. Next, we profiled gene expression in disease-relevant tissue (peripheral blood CD4+ lymphocytes) 
collected from subjects who had been genotyped. We then integrated the GWAS and gene expression data using 
expression single nucleotide polymorphism (eSNP), coexpression network, and pathway approaches to identify the biologic 
relevance of our GWAS. 

Results: GWAS revealed ethnicity-specific findings, with 4 genome-wide significant loci among Latinos and 1 
genome-wide significant locus in the GWAS meta-analysis across ethnic groups. To identify biologic context for 
these results, we constructed a coexpression network to define modules of genes with similar patterns of CD4+ gene 
expression (coexpression modules) that could serve as constructs of broader gene expression. Six of the 22 GWAS loci 
with P-value < 1 x 10^-6 tagged one particular coexpression module (4.0-fold enrichment, P-value 0.0029), and this 
module also had the greatest enrichment (3.4-fold enrichment, P-value 2.6 x 10^-24) for allergic rhinitis-associated 
eSNPs (genetic variants associated with both gene expression and allergic rhinitis). The integrated GWAS, coexpression 
network, and eSNP results therefore supported this coexpression module as an allergic rhinitis module. Pathway analysis 
revealed that the module was enriched for mitochondrial pathways (8.6-fold enrichment, P-value 4.5 x 10^-72). 

Conclusions: Our results highlight mitochondrial pathways as a target for further investigation of allergic rhinitis 
mechanism and treatment. Our integrated approach can be applied to provide biologic context for GWAS of other 
diseases. 

Keywords: Genome-wide association study, Allergic rhinitis, Coexpression network, Expression single-nucleotide 
polymorphism, Coexpression module, Pathway, Mitochondria, Hay fever, Allergy 



Background 

Allergic rhinitis is an IgE-mediated inflammation of the 
upper airway that causes naso-ocular congestion, pruri- 
tis, rhinorrhea, and sneezing [1]. Colloquially referred to 
as hay fever, seasonal allergies, and allergies, allergic 
rhinitis is one of the most common chronic diseases, affecting 
up to 30% of adults and 40% of children [1]. 

A genetic contribution to allergic rhinitis is evident, 
based on an increased incidence and prevalence of aller- 
gic rhinitis among twins and within atopic families [2,3]. 
Despite the high population prevalence of allergic rhin- 
itis, there have been relatively few studies of its genetic 
basis. The National Human Genome Research Institute 
catalogs just one genome wide association study (GWAS) 
of allergic rhinitis [4], for example, compared to 33 for 
asthma and 61 for diabetes [5]. Candidate gene studies 
have been performed with variable effect sizes and levels 
of significance reported [3,6,7]. We are aware of three 
prior GWAS of allergic rhinitis. Andiappan et al. found no 
genome-wide significant loci in a GWAS of allergic rhinitis 
in 942 Chinese subjects [8]. Ramasamy et al. reported 
one genome-wide significant locus in a GWAS meta-analysis 
of 12,898 Europeans [4]. Hinds et al. reported 16 
genome-wide significant loci for self-reported allergy in a 
GWAS meta-analysis of subjects of European ancestry [9]. 
The functional implications of the identified loci were 
not directly examined in these studies. In a GWAS of 
allergen-specific IgE level (i.e. not allergic rhinitis), 
Bønnelykke et al. estimated that ten loci associated with 
allergen-specific IgE level accounted for 25% population- 
attributable risk for allergic rhinitis [10], but this was not 
from a direct study of allergic rhinitis. Of note, loci associ- 
ated with allergen-specific IgE level have not been consist- 
ently associated with allergic rhinitis [4,9]. Given that the 
genetic loci identified to date do not fully explain the esti- 
mated heritability of allergic rhinitis, it is likely that as yet 
unidentified genes and pathways contribute to allergic 
rhinitis pathogenesis. 

GWAS results on their own, while helping to elucidate 
the etiology of disease, do not provide a rich context 
within which to interpret any finding [11,12]. For example, 
for disease-associated SNPs in intergenic regions, 
the affected gene is not necessarily immediately known [13]. 
Typically the closest gene is identified as the gene of 
interest, but that is not a foolproof algorithm, and the 



pathways affected by the genetic locus are also not ne- 
cessarily immediately apparent [13]. In addition, given 
the stringent P value thresholds that must be adopted in 
a GWAS to declare genome-wide significance, much of 
the data in a GWAS that may inform on disease is ig- 
nored because the association P values (and effect sizes) 
that reflect true associations cannot be distinguished 
from the noise [14]. 

Various methods have been tried to identify biologic 
context for loci identified by GWAS, including (1) ex- 
pression quantitative trait loci (eQTL) mapping and ex- 
pression single nucleotide polymorphism (eSNP) analysis 
[10,15-18], (2) network analysis [19,20], and (3) pathway 
analysis [18,21-23]. eQTL mapping and eSNP analysis are 
frequently used [15-18]. The motivation for eQTL map- 
ping and eSNP analysis is that genetic variation is more 
likely to impact a disease trait if it alters gene transcrip- 
tion. Linkage or association methods can be used to 
identify genetic loci influencing gene expression. The 
linkage-based identification of loci for gene expression 
is called eQTL mapping, and the association-based iden- 
tification of SNPs affecting gene expression is called eSNP 
analysis [15]. Because complex traits such as allergic rhin- 
itis are unlikely to be governed by single genes or loci, 
however, eQTL and eSNP analyses alone may provide 
insufficient context. Network approaches can model 
vast networks of gene interactions that modulate dis- 
ease [19,20,24]. Networks are formed by considering 
pairwise relationships between genes, including protein 
interaction relationships and coexpression relation- 
ships [14,24]. Considering GWAS results in the con- 
text of whole-gene networks may thus provide the 
necessary context within which to interpret the disease 
role for a given gene or variant identified by GWAS. 
Finally, pathway analysis can help decipher the func- 
tional implications of coherent groups of genes with re- 
spect to gene ontology functional categories [18,21-23]. 
Pathways representing specific biologic mechanisms may 
be overrepresented in genes identified by GWAS, thereby 
providing relevant biologic context for GWAS results. 

Among all GWAS, some have reported findings with- 
out characterizing the effects of loci on gene expression 
and downstream biologic pathways [4,8,25], while others 
have incorporated eQTL/eSNP, network analysis, and 
pathway analysis individually to provide some evidence 
for downstream effect [15-17,19,21,23]. Integrative approaches 
have elucidated biologic mechanisms and treatment 
targets in a number of disease areas including 
inflammatory bowel disease, Alzheimer's disease, dia- 
betes, heart disease, and obesity [15,24,26-29], but simi- 
lar strategies have not been widely applied to allergy. 
Further, the gene expression data used to support many 
GWAS are drawn from individuals distinct from those 
who were genotyped [16,18,19,21,23], rendering the ana- 
lysis of any effects of genotype on gene expression indir- 
ect and potentially biased due to differences in subjects 
who were genotyped versus subjects with mRNA data. 
For example, while Hinds et al. performed GWAS to 
identify allergy-related loci in a sample of personal gen- 
etics company customers and birth cohort participants 
[9], they then identified expression quantitative trait loci 
(eQTL) among these loci using monocyte gene expres- 
sion data from a distinct study cohort of heart disease. 

We hypothesized that a genome-wide approach to al- 
lergic rhinitis integrating GWAS with eSNP, coexpres- 
sion, and pathway analyses using gene expression data 
generated from disease-relevant tissue collected from 
the same individuals who were genotyped could enhance 
the power over standard GWAS to identify disease- 
relevant loci. Such an approach could not only provide 
more robust biological context, but also leverage data 
from cohorts that may not be large enough to yield high 
numbers of genome-wide significant GWAS results for 
complex traits such as allergic rhinitis. Here we present 
our integrated genomic analysis of allergic rhinitis, 
where we not only identified genome-wide significant 
genetic variants associated with allergic rhinitis, but 
also explored the biologic context for these results by 
profiling gene expression from CD4+ lymphocytes col- 
lected from genotyped subjects and performing expres- 
sion single nucleotide polymorphism (eSNP), network, 
and pathway analyses. Our integrated approach identi- 
fied a novel pathway in allergic rhinitis. 

Results 

Our integrated genomic analysis of allergic rhinitis yielded 
results from GWAS, gene expression profiling, and their 
integrated analysis (Figure 1). We first describe the results 
of our GWAS of allergic rhinitis in 5633 ethnically diverse 
North American subjects, where we identified genome- 
wide significant loci that were specific to ethnicity 
(Figure 1, pink box). We then describe the results of 
our gene expression profiling of immune cells key to 
allergy (CD4+ lymphocytes [30]), collected from the 
peripheral blood of selected subjects who had under- 
gone GWAS (Figure 1, blue box). We share the results 
for the weighted gene coexpression network [31] we 
constructed to identify modules of genes expressed to- 
gether. Finally, we describe the integration of our 



GWAS and gene expression analyses (Figure 1, purple 
box), where we performed eSNP analysis to assess for 
the association between genetic variation and gene ex- 
pression (Figure 1, purple path), assessed GWAS loci 
for eSNPs (Figure 1, turquoise path), identified coexpres- 
sion modules tagged by GWAS loci (Figure 1, orange 
path), and analyzed coexpression modules for enrichment 
of allergic rhinitis-associated eSNPs (Figure 1, green path) 
[15]. We then used pathway analysis to further inform on 
the biological context for our integrated findings. 

GWAS 

Subject characteristics 

The baseline characteristics of the participating subjects 
are shown in Table 1. In total, there were 5633 subjects 
from 7 EVE Consortium study centers [25] in the United 
States, Mexico, and Barbados who were assessed for al- 
lergic rhinitis. 2756 (49%) were female. Participants were 
diverse, with 2034 (36%) European American, 2326 
(41%) Latino, and 1273 (23%) African American/African 
Caribbean. The overall prevalence of allergic rhinitis 
cases was 48% (2712 subjects). 

GWAS and meta-analysis 

Because subjects were ethnically diverse, we pooled 
genotype data from the 7 study centers into three ethnic 
groups for GWAS: European American, Latino, and 
African American/African Caribbean (Figure 1, pink 
box) and controlled for population stratification within 
each ethnic group using principal components. Figure 2 
shows the results of genome-wide association studies for 
allergic rhinitis among European Americans, Latinos, 
and African Americans/African Caribbeans, in addition 
to the results of the meta-analysis across these ethnic 
groups. For additional views, Additional file 1: Figure S1 
shows the Manhattan plots separately for each ethnic 
group and for the meta-analysis. There were distinct 
findings for each ethnic group. Figure 3 summarizes 
the results for the 22 loci with P value for association 
< 1 x 10^-6 in at least one of the ethnic groups or in 
the meta-analysis. We show loci meeting this threshold to 
include loci with suggestive associations (P value < 1 x 10^-6) 
in addition to those genome-wide significant (defined as P 
value < 5 x 10^-8), as loci not meeting strict definitions of 
genome-wide significance can have biologic relevance 
[11,12]. Allele frequencies are shown in Additional file 2: 
Table S1, and a QQ plot for the GWAS meta-analysis is 
shown in Additional file 3: Figure S2. The genomic inflation 
factor was 1.06, supporting adequate control for population 
stratification. 
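
As an illustration of how a genomic inflation factor such as the 1.06 reported above is commonly computed, the following minimal sketch takes the ratio of the median observed 1-degree-of-freedom chi-square statistic to its expected median under the null. The function name and the evenly spaced P values are hypothetical, and this section does not state which software or exact procedure was used in the study.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Return lambda_GC: median observed 1-df chi-square over the null median."""
    norm = NormalDist()
    # Convert each two-sided P value to a 1-df chi-square statistic (z squared).
    # P values must lie strictly between 0 and 1.
    chi2_stats = [norm.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    # Median of a 1-df chi-square distribution, from the normal quantile at 0.75.
    expected_median = norm.inv_cdf(0.75) ** 2
    return median(chi2_stats) / expected_median


# Hypothetical, evenly spaced P values mimicking a well-calibrated null.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
lambda_gc = genomic_inflation(null_like)

# Halving every P value mimics systematic inflation of the test statistics.
lambda_inflated = genomic_inflation([p * 0.5 for p in null_like])
```

Values close to 1 suggest that test statistics are not systematically inflated, for example by residual population stratification.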

Four loci on chromosomes 2p22.3 near LINC00486, 
3q29 near DLG1, 10p15.1 near AKR1E2, and 19q13.43 
near ZNF776 were genome-wide significant among Lati- 
nos (Figure 3). The regional association plots for these 
Figure 1 Study flow for the integrated genome-wide association, coexpression network, and expression single nucleotide polymorphism 
analysis of allergic rhinitis. CHS = Children's Health Study, CAMP = Childhood Asthma Management Program, CAG = Chicago Asthma Genetics 
Study, CSGA = Collaborative Studies on the Genetics of Asthma, SARP = Severe Asthma Research Program, GALA1 = Genetics of Asthma in Latinos, 
MCCAS = Mexico City Childhood Asthma Study, GRAAD = Genomic Research on Asthma in the African Diaspora and Barbados, SAPPHIRE = Study of 
Asthma Phenotypes and Pharmacogenomic Interactions by Race-Ethnicity. Detailed descriptions of the individual studies have been 
published previously [25]. 



loci are shown in Additional file 4: Figure S3. For con- 
text, each of these SNPs was directly genotyped in 2 of 
the 7 populations, imputation was performed using very 
conservative metrics [25], and the imputation scores for 
these SNPs demonstrated good confidence (Additional 
file 5: Table S2). The regional LD plots for these loci 
(Additional file 6: Figure S4) show that there were lim- 
ited SNPs in LD with these genome-wide significant loci. 
The locus marked by rs7780001 on chromosome 7p21.1 
near FERD3L was genome-wide significant in the meta-analysis 
across ethnic groups (P value 2.0 x 10^-8; Figure 3 
and Additional file 7: Figure S5) and had nominally significant 
associations in all three ethnic groups. The loci marked 
by rs2884670 on chromosome 12p13.32 near DYRK4 and 
rs7237244 on chromosome 18q11.2 near LAMA3 also had 
nominally significant associations in all three ethnic groups. 



Among the 17 loci previously identified by GWAS as asso- 
ciated with allergic rhinitis [4,9], four were associated 
with allergic rhinitis with P value < 0.05 in our study 
(Additional file 8: Table S3). 

Individuals with allergic rhinitis frequently have co- 
morbid asthma [1,32]. Indeed, we observed that 2051 
(76%) of those with allergic rhinitis had asthma, and 
1195 (41%) of those without allergic rhinitis had asthma. 
As subphenotypes of allergic rhinitis based on asthma status are 
possible, we also performed secondary GWAS stratified 
by asthma status. These results are shown in the sup- 
plementary file (Additional file 9: Supplementary Re- 
sults 1, Additional file 10: Table S4, and Additional file 11: 
Figure S6) and similarly showed ethnicity-specific findings. 
In Additional file 12: Table S5, we show the sample composition 
of the stratified analyses according to asthma status. 

Table 1 Baseline characteristics of North American subjects included in the study 

| Study a | CHS | CAMP | CAG/CSGA/SARP | GALA1 | MCCAS | GRAAD | SAPPHIRE |
|---|---|---|---|---|---|---|---|
| Number | 2881 | 384 | 283 | 521 | 476 | 809 | 279 |
| Age (years) | 8.3 (5.2-14.3) | 8.8 (5.2-13.2) | 27.3 (6.0-81.0) | 14.8 (8.0-40.0) | 9.0 (5.0-17.0) | 40.0 (14.0-84.0) | 30.3 (12.0-56.0) |
| Female | 1344 (47%) | 142 (37%) | 150 (53%) | 230 (44%) | 198 (42%) | 474 (59%) | 219 (78%) |
| Race | | | | | | | |
| European American | 1552 (54%) | 384 (100%) | 98 (35%) | | | | |
| Latino | 1329 (46%) | | | 521 (100%) | 476 (100%) | | |
| African American/African Caribbean | | | 185 (65%) | | | 809 (100%) | 279 (100%) |
| Allergic Rhinitis | 1096 (38%) | 199 (52%) | 245 (87%) | 434 (83%) | 250 (53%) | 377 (47%) | 111 (40%) |
| Asthma | 1206 (42%) | 384 (100%) | 283 (100%) | 521 (100%) | 476 (100%) | 228 (28%) | 148 (53%) |
| Genotyping platform b | 550 K, 610 K | 550 K | 1Mv1 | 6.0 | 550 K | 650 K | 6.0 |

Values are mean (range) or number (percent). 

a CHS = Children's Health Study, CAMP = Childhood Asthma Management Program, CAG = Chicago Asthma Genetics Study, CSGA = Collaborative Studies on the 
Genetics of Asthma, SARP = Severe Asthma Research Program, GALA1 = Genetics of Asthma in Latinos, MCCAS = Mexico City Childhood Asthma Study, GRAAD = Genomic Research 
on Asthma in the African Diaspora and Barbados, SAPPHIRE = Study of Asthma Phenotypes and Pharmacogenomic Interactions by Race-Ethnicity. Detailed descriptions of the individual 
studies have been published previously [25]. 

b The Illumina arrays used were the 1Mv1, 550 k, 610 k and 650 k. The Affymetrix arrays used were the 500 k and 6.0. 



Genome-wide CD4+ gene expression and coexpression 
network to enhance GWAS 

To assess the potential biological impact of the loci identified 
in our GWAS analyses, we collected and measured genome- 
wide gene expression in disease-relevant tissue (peripheral 
blood CD4+ lymphocytes) from 200 subjects who had 
undergone GWAS and constructed a gene coexpression net- 
work based on the gene expression data (Figure 1, blue box). 
We built the coexpression network to identify coexpressed 
gene modules (i.e. groups of genes with similar patterns 
of expression profiles and interconnectivity across the 



experimental samples), as these could serve as broader 
constructs of gene expression and provide a path to 
discover broader biologic context [31]. 
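
To make the idea of a weighted coexpression network concrete, the sketch below computes a soft-thresholded adjacency between gene pairs as the absolute Pearson correlation raised to a power beta, which is the standard starting point of weighted gene coexpression network analysis. The gene names, expression values, and the choice beta = 6 are assumptions for illustration only; module detection (clustering on topological overlap) is omitted, and this section does not report the parameters used in the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(expression, beta=6):
    """Map each gene pair to |correlation| ** beta (unsigned network)."""
    genes = sorted(expression)
    adjacency = {}
    for i, gene_1 in enumerate(genes):
        for gene_2 in genes[i + 1:]:
            r = pearson(expression[gene_1], expression[gene_2])
            # Raising to beta suppresses weak correlations and keeps strong ones.
            adjacency[(gene_1, gene_2)] = abs(r) ** beta
    return adjacency


# Hypothetical expression profiles across five samples.
toy_expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.1, 3.9, 6.2, 8.0, 9.9],  # tracks GENE_A closely
    "GENE_C": [5.0, 1.0, 4.0, 2.0, 3.0],  # only weakly related to GENE_A
}
adj = soft_threshold_adjacency(toy_expression)
```

Genes with strongly correlated profiles keep an adjacency near 1, whereas weakly correlated pairs are pushed toward 0, so groups of tightly connected genes emerge as candidate modules.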

We achieved CD4+ lymphocyte yields of ~4 x 10^6 cells 
at >95% purity per collection. Bioanalyzer (Agilent Technologies, 
Santa Clara, CA) analysis confirmed average 
total RNA yields of 2 µg per collection, with minimal 
evidence of RNA degradation and 28S:18S ratios approaching 
2.0. 

Figure 4A shows the coexpression network we con- 
structed using weighted gene coexpression network analysis 



Figure 2 Manhattan plot of the genome-wide association and meta-analysis results for allergic rhinitis showing ethnicity-specific 
findings. 
| SNP | Location | Nearest gene | P value a: European American | P value a: Latino | P value a: African American | P value a: Meta-Analysis | OR (95% CI): European American | OR (95% CI): Latino | OR (95% CI): African American | OR (95% CI): Meta-Analysis |
|---|---|---|---|---|---|---|---|---|---|---|
| rs868688 | 1p36.32 | PRDM16 | 0.49 | 0.89 | 7.3E-07 | 0.025 | 0.95 (0.82-1.09) | 0.99 (0.86-1.12) | 0.53 (0.28-0.78) | 0.91 (0.83-0.99) |
| rs2149039 | 1p32.3 | BSND | 0.61 | 6.9E-07 | 0.06 | 0.002 | 0.95 (0.76-1.15) | 0.64 (0.46-0.82) | 1.05 (0.98-1.11) | 0.82 (0.73-0.93) |
| rs11680788 | 2p22.3 | LINC00486 | 0.50 | **3.8E-08** | 0.37 | 7.5E-07 | 0.88 (0.49-1.26) | 0.47 (0.19-0.74) | 0.68 (0.00-1.52) | 0.58 (0.47-0.72) |
| rs6583203 b | 3q29 | DLG1 | 0.78 | **1.4E-08** | 0.44 | 4.7E-04 | 1.03 (0.83-1.23) | 1.65 (1.48-1.83) | 0.91 (0.69-1.14) | 1.23 (1.09-1.37) |
| rs4713039 | 6p24.2 | SYCP2L | 0.095 | 6.0E-07 | 0.82 | 0.024 | 0.86 (0.69-1.04) | 1.57 (1.39-1.75) | 1.05 (0.60-1.51) | 0.87 (0.75-0.99) |
| rs6583337 | 7p22.3 | FAM20C | 0.32 | 0.18 | 1.0E-07 | 0.035 | 1.08 (0.93-1.23) | 0.89 (0.71-1.06) | 0.46 (0.18-0.75) | 0.89 (0.79-1.00) |
| rs7780001 c | 7p21.1 | FERD3L | 2.9E-05 | 0.0015 | 0.02 | **2.0E-08** | 0.67 (0.49-0.86) | 0.78 (0.62-0.93) | 0.72 (0.44-1.00) | 0.73 (0.62-0.84) |
| rs10156309 | 8q12.3 | NKAIN3 | 2.5E-07 | 0.86 | 0.90 | 1.8E-05 | 0.31 (0.00-0.75) | 0.93 (0.15-1.71) | 1.08 (0.00-2.19) | 0.45 (0.31-0.65) |
| rs10124907 d | 9p21.2 | TUSC1 | 0.26 | 5.9E-07 | 0.69 | 0.0057 | 1.09 (0.94-1.23) | 0.70 (0.57-0.84) | 0.95 (0.71-1.20) | 0.88 (0.79-0.97) |
| rs1332366 | 9q21.13 | OSTF1 | 0.63 | 0.94 | 9.8E-07 | 0.047 | 1.05 (0.86-1.24) | 0.99 (0.77-1.22) | 0.50 (0.22-0.78) | 0.88 (0.75-1.01) |
| rs2472448 | 9q31.1 | ABCA1 | 0.87 | 7.6E-07 | 0.51 | 8.8E-04 | 1.02 (0.79-1.25) | 0.55 (0.32-0.79) | 0.87 (0.46-1.28) | 0.77 (0.66-0.90) |
| rs17133587 | 10p15.1 | AKR1E2 | 0.44 | **4.5E-09** | 0.89 | 1.0E-06 | 1.10 (0.85-1.35) | 1.80 (1.61-2.00) | 0.96 (0.85-1.08) | 1.45 (1.25-1.69) |
| rs11027293 | 11p14.3 | SVIP | 0.22 | 6.5E-07 | 0.25 | 7.3E-06 | 1.33 (0.87-1.80) | 3.38 (2.90-3.86) | 2.26 (0.87-3.65) | 2.10 (1.52-2.91) |
| rs1893361 | 11q13.4 | CHRDL2 | 0.39 | 7.2E-07 | 0.33 | 0.0020 | 0.91 (0.70-1.12) | 1.66 (1.46-1.86) | 1.16 (0.87-1.45) | 1.23 (1.10-1.36) |
| rs2884670 | 12p13.32 | DYRK4 | 9.1E-04 | 0.0014 | 0.0081 | 1.5E-07 | 1.28 (1.13-1.42) | 1.25 (1.11-1.38) | 1.39 (1.15-1.63) | 1.28 (1.19-1.37) |
| rs1352323 | 15q26.1 | ST8SIA2 | 0.018 | 7.2E-07 | 0.19 | 9.5E-08 | 1.26 (1.07-1.45) | 1.47 (1.32-1.63) | 1.16 (0.94-1.39) | 1.33 (1.23-1.44) |
| rs12597084 | 16p13.3 | RBFOX1 | 0.46 | 5.0E-07 | 0.59 | 1.08E-04 | 0.95 (0.80-1.09) | 0.70 (0.55-0.84) | 1.11 (0.73-1.50) | 0.82 (0.72-0.92) |
| rs7187423 | 16q12.2 | FTO | 0.032 | 0.92 | 5.7E-07 | 0.046 | 0.71 (0.40-1.02) | 1.02 (0.70-1.33) | 2.02 (1.75-2.30) | 1.19 (1.02-1.37) |
| rs2061 | 17p12 | PMP22 | 0.0047 | 0.40 | 5.1E-05 | 8.9E-07 | 0.53 (0.10-0.97) | 0.87 (0.53-1.20) | 0.46 (0.09-0.84) | 0.49 (0.21-0.77) |
| rs7237244 | 18q11.2 | LAMA3 | 0.045 | 1.5E-05 | 0.0067 | 4.3E-07 | 0.62 (0.14-1.09) | 0.60 (0.36-0.83) | 0.09 (0.00-1.85) | 0.58 (0.47-0.72) |

Coexpression modules (columns): Brown, Pink, Magenta, Red, MN blue, Steel blue. 

a Results are shown for loci with P value for association < 1 x 10^-6 in at least one ethnic group or in the meta-analysis. Genome-wide significant associations (P value < 5 x 10^-8) are bolded. 
b rs6583204 and rs1018527 were in high LD with this variant and showed similar association patterns. 
c rs13446705 was in high LD with this variant and showed similar association patterns. 
d rs10124903 was in high LD with this variant and showed similar association patterns. 
e rs12972098 was in high LD with this variant and showed similar association patterns. 

Figure 3 Results of the genome-wide association studies of allergic rhinitis, meta-analysis, and GWAS tagging of the coexpression 
network. 



of the CD4+ lymphocyte gene expression data. In total, there 
were 41 coexpressed gene modules identified by the coex- 
pression network, and their interconnectivities are shown. 
For ease of visualization, modules are identified by color. 

Using pathway analysis, we found that the modules 
were enriched for a variety of gene ontology (GO) path- 
ways reflecting the functions being carried out by each 
module. Pathways associated with the largest coexpres- 
sion modules are shown in the legend of Figure 4A. For 
example, the brown module highlighted in Figure 4B 
was enriched for mitochondrial function. Zinc finger, in- 
flammatory response, and immunoglobulin domain were 
other pathways highlighted by examining the coexpres- 
sion modules for functional enrichment (Figure 4A). 

Integration of GWAS and CD4+ gene expression to 
explore biologic context for GWAS 

To explore the biologic context for our GWAS results, 
we analyzed our GWAS and gene expression findings 
together (Figure 1, purple box). 

GWAS loci that are eSNPs 

We first performed eSNP analysis to assess for the asso- 
ciation between genetic variation and gene expression 
(Figure 1, purple path). We then examined the GWAS 
and eSNP results together to identify GWAS loci that 
were eSNPs (Figure 1, turquoise path), as genetic vari- 
ation that is associated with both the trait and gene 



expression is more likely to be biologically relevant than 
variants that are associated with the trait only with no 
effect on gene expression. We found that the 19q13.43 
locus near ZNF776 was associated with allergic rhinitis 
(GWAS P value 5.0 x 10^-8) as well as CD4+ gene expression 
(χ² = 19.55, FDR-adjusted P value 0.00078). The 
other loci identified by GWAS were not associated with 
CD4+ gene expression. 
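
Because eSNP analysis tests many SNP-transcript pairs, the reported P values are adjusted for the false discovery rate (FDR). The sketch below shows one widely used adjustment, the Benjamini-Hochberg step-up procedure; that this is the specific FDR method used in the study is an assumption, and the raw P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values from five SNP-transcript association tests.
raw_p = [0.001, 0.008, 0.039, 0.041, 0.6]
adj_p = benjamini_hochberg(raw_p)
```

Each adjusted value can be read as the smallest FDR at which the corresponding association would be declared significant.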

Given the relatively modest size of our sample and the 
fact that we were examining a complex trait, we had an- 
ticipated that traditional GWAS would uncover only a 
small number of biologically relevant loci, even with the 
aid of eSNP analysis, as such an approach would rely 
upon detection of single variant associations with trait 
and expression. We therefore sought to leverage the 
CD4+ expression data more broadly through coexpres- 
sion network and pathway analysis. 

Coexpression modules tagged by GWAS loci 

Compared to individual genes, coexpression modules 
identified through coexpression network analysis (Figure 4) 
can serve as more general constructs of gene expression, 
providing a path to discover broader context and related 
loci [14,28,31,33-40]. Motivated by the same rationale that 
genetic variation that is associated with both the trait and 
gene expression is more likely to be biologically relevant 
than variants associated with the trait only, we mapped 
GWAS loci to CD4+ coexpression modules and examined 

Figure 4 CD4+ lymphocyte coexpression network with detail of the brown coexpression module. A. Each circle represents a gene. 
Weighted gene coexpression analysis identified groups of genes with similar patterns of gene expression and interconnectivity (coexpression 
modules). The 41 coexpression modules identified are labeled by color. Pathways associated with the largest coexpression modules are denoted 
in the legend. B. Interconnectivity of the brown coexpression module is shown in detail. Tagged by 6 allergic rhinitis GWAS loci, this 
coexpression module was highly enriched for allergic rhinitis-associated eSNPs (3.4-fold enrichment, FDR-adjusted P value = 2.6 x 10^-24) and also 
highly enriched for pathways related to mitochondrial function (8.6-fold enrichment, FDR-adjusted P value = 4.5 x 10^-72). Genes containing 
allergic rhinitis-associated eSNPs are marked in brown, with those containing eSNPs with lowest P-value for association between genotype 
and gene expression marked with greatest brown saturation. Genes in pathways related to mitochondrial function are marked by diamonds with 
blue outline. Higher correlation between gene expression is shown with thicker and darker edges. 



the modules that were tagged by GWAS loci (Figure 1, or- 
ange path). Specifically, we defined a GWAS locus as tag- 
ging a coexpression module if a coexpression module 
contained a gene within 250 kb of the locus. We found 
that 9 of the 22 GWAS loci tagged at least one coex- 
pression module and 6 coexpression modules in total 
(Figure 3). These 6 modules (the brown, pink, magenta, 
red, midnight blue, and steel blue coexpression modules) 
tagged by GWAS loci were therefore considered candidate 
allergic rhinitis associated modules that could inform on 
allergic rhinitis biology. The 19q13.43 locus near ZNF776 
was among the GWAS loci tagging coexpression modules, 
corroborating our eSNP results and illustrating the increased 
power of detection gained by using the coexpression 
module as a more general construct of gene 
expression. Of note, some of the candidate allergic rhinitis 
associated modules were tagged by GWAS loci that 
would not have been considered remarkable by traditional 
criteria for genome-wide significance of individual 
loci (P value < 5.0 x 10^-8). Our approach of using 
GWAS loci to tag coexpression modules therefore 
allowed us to gain additional utility from our GWAS 
results. 
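
The tagging rule described above (a locus tags a module if the module contains a gene within 250 kb of the locus) can be sketched as follows. All SNP names, gene names, coordinates, and module assignments here are hypothetical, and measuring distance to the nearest gene boundary is an assumption, since the text does not specify the reference point.

```python
from collections import defaultdict

WINDOW_BP = 250_000  # 250 kb window, as in the tagging definition


def tagged_modules(loci, genes, gene_to_module, window=WINDOW_BP):
    """Map each locus to the set of modules containing a gene within the window.

    loci: {snp_id: (chromosome, position)}
    genes: {gene: (chromosome, start, end)}
    gene_to_module: {gene: module_name}
    """
    tags = defaultdict(set)
    for snp, (chrom, pos) in loci.items():
        for gene, (gene_chrom, start, end) in genes.items():
            if gene_chrom != chrom or gene not in gene_to_module:
                continue
            # Distance to the gene body; zero when the SNP lies inside the gene.
            distance = max(start - pos, pos - end, 0)
            if distance <= window:
                tags[snp].add(gene_to_module[gene])
    return dict(tags)


# Hypothetical loci, gene coordinates, and module assignments.
toy_loci = {"snp1": ("1", 1_000_000), "snp2": ("2", 5_000_000)}
toy_genes = {
    "G1": ("1", 1_200_000, 1_210_000),  # 200 kb from snp1: tagged
    "G2": ("1", 1_400_000, 1_450_000),  # 400 kb from snp1: not tagged
    "G3": ("2", 4_990_000, 5_010_000),  # contains snp2: tagged
    "G4": ("2", 9_000_000, 9_100_000),  # far from snp2: not tagged
}
toy_modules = {"G1": "brown", "G2": "pink", "G3": "red", "G4": "brown"}
tags = tagged_modules(toy_loci, toy_genes, toy_modules)
```

A locus that lies near genes from several modules would tag each of them, which is consistent with 9 loci tagging 6 modules in total.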

Among the 6 candidate allergic rhinitis associated mod- 
ules tagged by GWAS loci, the brown module (represent- 
ing mitochondrial pathways according to pathway analysis 
(Figure 4)) was tagged by 6 of the 22 GWAS loci (Figure 3). 
This proportion represented a significant enrichment 
(4.0-fold enrichment, P-value 0.0029) over chance, sup- 
porting a connection between these GWAS loci and mito- 
chondrial pathway functions. 
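
One way to reason about an enrichment of this kind is a binomial tail probability. The sketch below is not the study's stated procedure: it assumes a binomial null in which each of the 22 loci independently tags the module with a background probability back-calculated from the reported 4.0-fold figure. Under those assumptions it computes the expected count and the probability of observing 6 or more tagging loci.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


observed, total_loci = 6, 22  # counts given in the text
reported_fold = 4.0           # fold enrichment given in the text

# Assumed background probability that a locus tags the module by chance,
# back-calculated from the reported fold enrichment.
background = (observed / total_loci) / reported_fold
expected = total_loci * background
p_enrich = binomial_tail(observed, total_loci, background)
```

Under these assumptions about 1.5 loci would be expected to tag the module by chance, and the tail probability is on the order of 10^-3.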



Coexpression module enrichment for allergic rhinitis-associated 
eSNPs 

To further ascertain whether any of the 6 candidate 
allergic rhinitis-associated modules underlie 
allergic rhinitis susceptibility, we tested whether these 
modules were enriched for eSNPs that were also associ- 
ated with allergic rhinitis (Figure 1, green path). eSNPs 
(i.e. SNPs associated with gene expression) represent 
functionally validated SNPs of interest in that they are 
associated with expression levels of genes in a cell type 
relevant to the disease under study [15]. While such 
individual associations may not be meaningful, the 
pattern of associations enriched within a given coex- 
pression module can provide strong statistical support 
for module involvement at the genetic level in the disease 
[14,28,31,33-40]. The additional association of eSNPs 
within a coexpression module with the disease of interest 
provides further statistical support that the coexpression 
module is involved in the disease, as the module is then 
not only enriched for eSNPs (e.g. SNPs associated with 
CD4+ gene expression), but more specifically, enriched 
for disease-associated eSNPs (e.g. SNPs associated with 
both CD4+ gene expression and allergic rhinitis). 
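
The module-level enrichment test described above can be illustrated with a one-sided Fisher's exact test, computed here as a hypergeometric upper tail. The counts are hypothetical, and the choice of genes as the counting unit and of a one-sided alternative are assumptions rather than details taken from the study.

```python
from math import comb


def fisher_enrichment(module_hits, module_size, all_hits, all_size):
    """Return (fold enrichment, one-sided Fisher's exact P value).

    module_hits: genes in the module with a disease-associated eSNP
    module_size: genes in the module
    all_hits: genes in the whole network with a disease-associated eSNP
    all_size: genes in the whole network
    """
    denominator = comb(all_size, module_size)
    upper = min(module_size, all_hits)
    # Probability of seeing at least module_hits hits in the module by chance.
    p_value = sum(
        comb(all_hits, k) * comb(all_size - all_hits, module_size - k)
        for k in range(module_hits, upper + 1)
    ) / denominator
    fold = (module_hits / module_size) / (all_hits / all_size)
    return fold, p_value


# Hypothetical counts: 8 of 20 module genes carry disease-associated eSNPs,
# compared with 20 of 200 genes in the whole network.
fold, p_value = fisher_enrichment(8, 20, 20, 200)
```

When several modules are tested, the resulting P values would then be adjusted for multiple testing, as with the FDR-adjusted values reported here.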

In this instance, we identified the brown module as giving 
rise to the greatest enrichment of allergic rhinitis-associated 
eSNPs (3.4-fold enrichment; FDR-adjusted 
Fisher's Exact Test P value 2.6 x 10^-24) (Figure 5), thus 
providing statistical support for involvement of the brown 
module in allergic rhinitis. Pathway analysis revealed that 
the brown module was enriched for mitochondrial pathways 
(8.6-fold enrichment, FDR-adjusted Fisher's Exact 
a Module enrichment for eSNPs that were also associated with allergic rhinitis. 

Figure 5 eSNP enrichment and pathway analysis of coexpression modules tagged by GWAS loci. 
Test P value = 4.5 x 10^-72) (Figure 5 and Figure 4B). The 
red and midnight blue modules were also enriched for 
allergic rhinitis-associated eSNPs, and both these mod- 
ules were enriched for mitochondrial pathways as well 
(Figure 5). The pink module, enriched for the GO term 
intracellular organelle lumen, functionally overlaps with 
mitochondrial pathway. Thus, the candidate allergic 
rhinitis-associated modules were all significantly enriched 
for allergic rhinitis-associated eSNPs, and pathway analysis 
results for at least half of these modules highlighted 
mitochondrial pathways as linked to allergic rhinitis. Analyses 
of randomized networks did not yield meaningful 
results (Additional file 13: Supplementary Results 2, 
Additional file 14: Figure S7). 

In summary, pathway analysis of our results from the 
integration of GWAS and coexpression network analysis 
(showing a significantly high number of GWAS loci tag- 
ging the brown module), as well as from the integration 
of eSNP analysis and coexpression network results 
(showing greatest enrichment of the brown module for 
allergic rhinitis-associated eSNPs), both pointed to mito- 
chondrial pathways as playing an important role in aller- 
gic rhinitis (Figure 1). Key to these results was that the 
coexpression network helped organize the expression 
traits into coherent, highly interconnected modules reflect- 
ing the biological processes at play in the tissue. By using 
GWAS loci to tag coexpression modules and then requir- 
ing these tagged coexpression modules to be enriched for 
eSNPs that were also associated with allergic rhinitis, we 
were able to place candidate GWAS associations in a more 
informed context that not only provided a biological con- 
text for GWAS interpretation, but enhanced confidence in 
the suggestive hits given an enrichment of multiple func- 
tional SNPs associating with the disease phenotype [15]. 
Through this approach, we found consistent evidence that 
mitochondrial pathways likely play a role in the genetics 
and pathophysiology of allergic rhinitis. 

Discussion 

A motivation for genome-wide studies is the desire to 
identify novel pathways and mechanisms in disease 
pathogenesis. A limitation of traditional genome-wide 
association studies is that statistically significant loci 
may be identified [4,9], but the biological relevance of 
the individual or aggregate variants is often not evident 
[11,12]. This is not overcome by replication of genotype 
associations, which has been the usual path taken to fol- 
low up GWAS findings, and one which has led to limited 
success [7,9,41]. Our study demonstrates the advantages 
of integrating network approaches with GWAS to identify 
and prioritize pathways and gene targets of biologic 
relevance. By integrating our GWAS findings with eSNP, 
coexpression, and pathway analyses using gene expression 
data from disease-relevant tissue generated from 
subjects who had undergone GWAS, we tested the 
potential biologic context of our GWAS findings and 
identified a novel pathway in allergic rhinitis: 
mitochondrial pathways. Our method 
allowed us to leverage data from multiple GWAS loci 
to identify biologic context for the aggregate findings. 
This is in contrast to traditional GWAS, where the im- 
plications of individual SNP associations are often 
challenging to define [11,12]. Because complex traits 
such as allergic rhinitis are unlikely to be governed by 
single variants, strategies that capitalize on broader 
constructs of GWAS and gene expression results are 
more likely to yield informative disease context. We 
adopted such a strategy and were able to identify novel 
biologic context for allergic rhinitis. Further, our ap- 
proach of integrating genotype and gene expression 
data generated from the same sample has not been 
widely applied to the study of allergic diseases. Our 
methods can be used to provide a richer biologic con- 
text for GWAS findings in other disease areas. 

While mitochondrial pathways have not been associ- 
ated with allergic rhinitis pathogenesis in traditional de- 
scriptions [1] or genetic studies of the disease [4,8,9], 
our findings and those from laboratory-based studies of 
airway dysfunction support a role for mitochondrial 
perturbations in allergic rhinitis pathogenesis. There is a 
strong link between upper (e.g. nasal) and lower (e.g. bron- 
chial) airway disease pathogenesis [42], and mitochondrial 
perturbations have been observed to affect airway inflam- 
mation. Mitochondria are the major source of endogenous 
reactive oxygen species, which are required for normal 
function of the acquired immune response, including nor- 
mal T-cell activation, B-cell differentiation, and T-cell and 
B-cell proliferation [43]. Because alterations in the acquired 
immune response are observed in allergic inflammation 
and allergic rhinitis, mitochondrial disruption could play a 
role in allergic rhinitis. There are some experimental data 
in support of this hypothesis. OVA-induced allergic airway 
inflammation in BALB/c mice triggers mitochondrial dys- 
function, including the reduction of cytochrome c oxidase 
activity in lung mitochondria, reduction in the expression 
of subunit III of cytochrome c oxidase in bronchial epithe- 
lium, appearance of cytochrome c in lung cytosol, and 
mitochondrial ultrastructural changes such as loss of cris- 
tae and swelling [44]. Experiments using pollen, rather 
than OVA, to induce allergic inflammation more akin to 
allergic rhinitis in humans have also shown mitochondrial 
disturbance. Pollen grains and subpollen particles have 
intrinsic NADPH oxidases [45]. Upon hydration in the air- 
way epithelium they produce reactive oxygen species that 
induce oxidative stress [46]. The pollen-induced oxidative 
stress damages mitochondrial respiratory chain proteins 
(specifically NADH dehydrogenase Fe-S protein (NDUFS) 
and ubiquinol-cytochrome c reductase core (UQCRC)) in 
human airway epithelial cells, triggers reactive oxygen 
species production from mitochondrial respiratory chain 
complex III, and induces mitochondrial dysfunction in 
complex III [47]. 

Mitochondrial changes induced by pollen can serve as the 
second hit leading to allergic inflammation if there is preex- 
isting mitochondrial dysfunction. Indeed, when treated in- 
tranasally with pollen extract, mice with mitochondrial 
dysfunction (induced deficiency in UQCRC2) demonstrated 
evidence of allergic airway inflammation, in contrast to 
UQCRC2-sufficient control mice challenged with the 
same pollen extract [47]. Specifically, these mice with 
mitochondrial dysfunction exhibited a 4.4-fold increase 
in bronchoalveolar lavage eosinophil counts, increased 
accumulation of peribronchial inflammatory cells, en- 
hanced mucous cell metaplasia in airway epithelium, and 
increased airway hyperresponsiveness [47]. Although these 
studies characterized changes in the lower airway, the 
strong link between upper and lower airway disease patho- 
genesis [42] suggests that analogous changes in the upper 
airway could cause individuals with preexisting mitochon- 
drial dysfunction to develop allergic inflammation leading 
to allergic rhinitis with pollen exposure. Our results high- 
light this as a potential mechanistic area, as 27% (6/22) of 
the genetic loci for allergic rhinitis that we identified by 
genome-wide association analysis tagged a gene coexpres- 
sion module that was not only markedly enriched for 
eSNPs associated with allergic rhinitis (3.4-fold, FDR-adjusted 
P value 2.6 x 10^-24), but also significantly enriched 
for mitochondrial pathways by pathway analysis (8.6-fold 
enrichment, FDR-adjusted P value 4.5 x 10^-72). 

Population-based studies additionally support a role for 
mitochondrial pathways in allergic rhinitis pathogenesis. 
Mitochondria are the primary sites of oxidative reactions. 
Levels of malondialdehyde (a marker of oxidative stress) 
are higher, and levels of reduced glutathione (an antioxi- 
dant) are lower in the exhaled nasal condensates of aller- 
gic rhinitis subjects compared to healthy controls [48]. 
The epidemiological link between maternal history of 
atopy (as opposed to paternal history of atopy) and greater 
risk for allergic rhinitis in offspring [49,50] may be ex- 
plained by the fact that mitochondria are maternally trans- 
mitted. Consistent with this, mitochondrial haplotypes are 
associated with intermediate phenotypes of allergic rhin- 
itis, including total serum IgE levels and skin prick test re- 
activity [51]. 

Our results suggest that reducing mitochondrial dys- 
function could improve allergic rhinitis. In murine models 
of allergic airway inflammation, intratracheal administra- 
tion of an antioxidant known to enter mitochondria and 
protect the electron chain from oxidative damage [52] de- 
creased allergen-induced airway hyperreactivity and air- 
way inflammation severity, as shown by reduced numbers 
of inflammatory cells in bronchoalveolar fluid [53]. Again, 
these studies focused on lower airway disease, but the 
strong link between upper and lower airway disease 
pathogenesis [42] suggests that it may be possible to 
achieve similar results if the upper airway were targeted in 
allergic rhinitis treatment. 

Our GWAS of allergic rhinitis was the first to examine 
ethnically diverse subjects for this complex trait, and our 
study revealed susceptibility loci that were specific to 
ethnicity. This is consistent with genome-wide association 
studies of other complex diseases, such as asthma [25,54] and 
obesity [55,56], that have also demonstrated 
ethnicity-specific effects. Given the possibility that allergic rhinitis 
with comorbid asthma vs. allergic rhinitis without comorbid 
asthma may be distinct disease subphenotypes [32], we 
also performed secondary GWAS analyses stratified by 
asthma status. These results similarly showed 
ethnicity-specific findings. Our findings support the utility of 
studying ancestrally diverse populations in genome-wide 
studies. 

We recognize the limitations of our study. We defined 
allergic rhinitis using criteria commonly employed in 
population-based and genetic studies of allergic rhinitis, 
which are based on questionnaire responses without objective 
markers [1,4,9]. For our eSNP and coexpression network 
analyses, it would have been ideal to have profiled gene ex- 
pression for all 5633 subjects who participated in the 
GWAS, but CD4+ lymphocytes were not available from all 
subjects. We had CD4+ lymphocytes from European- 
American CAMP subjects only, and it is possible that coex- 
pression results would have differed had we additionally 
had expression profiles from subjects of other ethnic 
backgrounds, as gene expression can vary by ethnicity. 
Although expression levels can differ by ethnicity, the 
connectivity structure is expected to be much more highly 
conserved, and is even conserved across species [57]. 
Additionally, we recognize that our coexpression network 
may have yielded distinct results had we chosen a different 
tissue for gene expression profiling; we had chosen to study 
peripheral blood CD4+ lymphocytes given their central role 
in allergic disease [30]. Despite these limitations, we were 
able to implement an integrative analysis of our GWAS, 
coexpression network, and eSNP results, leading to the 
identification of a novel biologic pathway in allergic rhinitis. 
Our strategy created an informed biological context for our 
GWAS that may be used to better understand allergic rhin- 
itis. Further, our methods may be implemented to provide 
biologic context for GWAS of other diseases. 

Conclusions 

Our GWAS of allergic rhinitis in 5633 ethnically diverse 
subjects demonstrated ethnicity-specific, genome-wide 
significant findings. To determine the potential biological 
impact of the variants identified in our GWAS, we inte- 
grated eSNP, coexpression network, and pathway analyses 
using gene expression data generated from subjects who 
had undergone GWAS. Our integrated approach identified 
mitochondrial pathways as important in allergic rhinitis, and 
our strategy may prove useful to studying other diseases. 

Methods 

Ethics statement 

Each study was approved by the Institutional Review 
Board of the corresponding institution. Informed con- 
sent was obtained for all study participants, and where 
appropriate, informed assent from minors and informed 
consent from their parent were obtained. 

Subjects, genotyping, and phenotyping 

Subjects were recruited from EVE Consortium centers in 
the United States, Mexico, and Barbados. Detailed 
descriptions of the individual studies, genotyping platforms, 
and quality control protocols have been published 
previously [25]. Of note, SNPs with imputation quality scores 
below a threshold (Rsq < 0.3) were removed from the ana- 
lysis. We included subjects who were specifically assessed 
for allergic rhinitis, and these came from 7 study centers 
(Figure 1). Allergic rhinitis status was considered positive 
if a subject reported a history of allergic rhinitis ever, de- 
fined as hay fever or runny/stuffy nose with sneezing or 
itching when the subject did not also have a cold or flu. 

GWAS and meta-analysis 

Summary files on a common set of SNPs were shared 
among the EVE Consortium investigators. Genotype 
imputation using HapMap reference panels and Markov 
Chain Haplotyping software (MaCH) [58] was 
performed in each sample as previously described [25]. We 
pooled the imputed genotype data for each ethnic group 
(European American, Latino, African American/African 
Caribbean) (Figure 1). To adjust for potential population 
stratification, we used Eigenstrat [59] to create principal 
components for each ethnic group. Within each ethnic 
group, we tested for the association of SNPs with allergic 
rhinitis by constructing a test statistic that had a stand- 
ard normal distribution under the null hypothesis of no 
association and captured the direction of the effect. 
Models were implemented in PLINK [60] and controlled 
for age, sex, and principal components. To allow for com- 
parability with previous GWAS of allergic rhinitis [4,8,9], 
we did not include asthma status as a covariate. To assess 
for the effects of SNPs across ethnic groups, we then cal- 
culated a meta-analysis statistic as a combination of the 
individual ethnic study scores using METAL [61]. 
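
As a rough illustration of combining signed per-group test statistics into one meta-analysis statistic, the sketch below uses a sample-size-weighted Z-score combination. This is an assumption: the text states only that METAL was used, not which weighting scheme. The function names, z-scores, and sample sizes are hypothetical.

```python
import math


def stouffer_meta_z(z_scores, sample_sizes):
    """Combine signed per-group z-scores, weighting each by sqrt(sample size)."""
    if not z_scores or len(z_scores) != len(sample_sizes):
        raise ValueError("need exactly one sample size per z-score")
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided P value for a statistic that is standard normal under the null."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical z-scores for one SNP in three ethnic groups (illustrative only).
z_by_group = [2.1, 3.0, 1.2]
n_by_group = [1000, 2500, 900]
z_meta = stouffer_meta_z(z_by_group, n_by_group)
p_meta = two_sided_p(z_meta)
```

Because each z-score keeps its sign, effects in opposite directions across groups partly cancel, which matches the requirement that the statistic capture the direction of the effect.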

Recognizing the potential subphenotypes of isolated 
allergic rhinitis vs. allergic rhinitis with comorbid asthma 
[1,32], we additionally performed secondary GWAS and 
meta-analyses in subjects stratified by asthma status 
using methods analogous to the above. 



Genome-wide CD4+ gene expression 

We collected peripheral blood CD4+ lymphocytes from 
200 subjects who had undergone GWAS. These 200 
subjects were from the Childhood Asthma Management 
Program (CAMP) cohort [62], one of the member centers of 
the EVE Consortium (Figure 1). We focused on this 
sample subset because of biospecimen availability. Per- 
ipheral blood was collected into BD Vacutainer CPT 
tubes (BD Diagnostics, Franklin Lakes, New Jersey) and 
placed on ice. Samples were centrifuged within 1 hour 
of collection for 20 minutes at 1700 RCF, followed by 
mononuclear cell layer isolation and suspension in 10 ml 
of PBS. We isolated CD4+ lymphocytes using anti-CD4+ 
microbeads by column separation (Miltenyi Biotec, 
Auburn, CA) using 20 µl anti-CD4+ microbeads per 10^6 
total cells. To extract total RNA, we used the RNeasy Mini 
Protocol (QIAGEN, Valencia, CA) and stored the RNA at -80°C. 
We generated expression profiles with the Illumina 
HumanRef8 v2 BeadChip arrays (Illumina, San Diego, 
CA). Expression data were log2 transformed and 
quantile normalized. 
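
The log2 transformation and quantile normalization step can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the matrix layout (rows are genes, columns are samples), the tie-breaking rule, and the function names are assumptions, and the intensities are made up.

```python
import math


def log2_transform(matrix):
    """Apply log2 to every intensity; rows are genes, columns are samples."""
    return [[math.log2(value) for value in row] for row in matrix]


def quantile_normalize(matrix):
    """Give every sample (column) the same empirical distribution.

    Ties are broken by row order for simplicity (an assumption of this sketch).
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    # For each column, the row indices ordered from smallest to largest value.
    order = [
        sorted(range(n_rows), key=lambda i: matrix[i][j]) for j in range(n_cols)
    ]
    # Reference distribution: mean across samples of the k-th smallest value.
    reference = [
        sum(matrix[order[j][k]][j] for j in range(n_cols)) / n_cols
        for k in range(n_rows)
    ]
    normalized = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        for rank, i in enumerate(order[j]):
            normalized[i][j] = reference[rank]
    return normalized


# Hypothetical raw intensities for 3 genes in 2 samples.
raw = [[8.0, 16.0], [2.0, 4.0], [32.0, 64.0]]
normalized = quantile_normalize(log2_transform(raw))
```

After normalization, sorting any column yields the same list of values, so differences between samples reflect rank changes of genes rather than array-wide intensity shifts.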

Coexpression network analysis 

We performed weighted gene coexpression network 
analysis to identify coexpressed gene modules [31], using 
a well-established and validated method to construct the 
coexpression network [14,28,31,33-40]. For module detection, we used 
average linkage hierarchical clustering of a topological 
overlap matrix based on an adjacency matrix that is 
composed of power-transformed correlations between 
gene expression profiles [31]. To cut branches of the tree 
into gene modules, we used the dynamic tree cutting al- 
gorithm, which iteratively searches for stable branch 
sizes and chooses clusters based on the shape of each 
dendrogram branch [63]. This algorithm allows manipu- 
lation of several parameters controlling the resultant 
cluster size and cohesiveness. The modules identified 
from the coexpression network were then carried for- 
ward into the integrative analysis. 
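
To make the adjacency and topological overlap steps concrete, the sketch below computes both for a toy expression matrix. It assumes an unsigned network and a soft-thresholding power of 6; the text specifies neither, and the function names are hypothetical. Hierarchical clustering and dynamic tree cutting are not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes nonzero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def adjacency_matrix(expression, power=6):
    """Unsigned adjacency a_ij = |cor(gene_i, gene_j)| ** power, diagonal 0.

    The power of 6 is an assumed default, not a value taken from the paper.
    """
    n_genes = len(expression)
    adj = [[0.0] * n_genes for _ in range(n_genes)]
    for i in range(n_genes):
        for j in range(i + 1, n_genes):
            a = abs(pearson(expression[i], expression[j])) ** power
            adj[i][j] = adj[j][i] = a
    return adj


def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    n_genes = len(adj)
    connectivity = [sum(row) for row in adj]  # diagonal entries are 0
    tom = [[1.0] * n_genes for _ in range(n_genes)]
    for i in range(n_genes):
        for j in range(i + 1, n_genes):
            shared = sum(
                adj[i][u] * adj[u][j] for u in range(n_genes) if u not in (i, j)
            )
            denominator = min(connectivity[i], connectivity[j]) + 1.0 - adj[i][j]
            tom[i][j] = tom[j][i] = (shared + adj[i][j]) / denominator
    return tom


# Toy data: genes 0 and 1 are perfectly correlated; gene 2 is uncorrelated with both.
toy_expression = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.0, -1.0, -1.0, 1.0],
]
toy_tom = topological_overlap(adjacency_matrix(toy_expression))
```

The dissimilarity 1 - TOM would then serve as the distance for average linkage hierarchical clustering, so genes 0 and 1 in the toy data would merge first.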

To provide support for the specificity of our coexpres- 
sion network, we generated multiple random coexpres- 
sion networks where gene assignments were randomized 
(Random Networks 1-3), as well as random networks 
where the gene expression levels were randomized 
(Random Networks 4-6). We carried these random 
networks forward into the integrated analysis as well. 
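
One plausible way to build the two kinds of random networks is sketched below. The exact randomization procedure is not described in the text, so both functions are assumptions: the first shuffles gene-to-module labels while preserving module sizes, and the second permutes each gene's expression values across samples. All names and data are hypothetical.

```python
import random


def shuffle_module_assignments(gene_to_module, seed):
    """Reassign genes to modules at random while keeping every module's size."""
    rng = random.Random(seed)
    genes = sorted(gene_to_module)
    labels = [gene_to_module[gene] for gene in genes]
    rng.shuffle(labels)
    return dict(zip(genes, labels))


def shuffle_expression_levels(expression, seed):
    """Permute each gene's values across samples, breaking gene-gene correlation."""
    rng = random.Random(seed)
    shuffled = {}
    for gene, values in expression.items():
        permuted = list(values)  # copy so the input is left untouched
        rng.shuffle(permuted)
        shuffled[gene] = permuted
    return shuffled


# Hypothetical inputs, for illustration only.
modules = {"g1": "brown", "g2": "brown", "g3": "red", "g4": "pink"}
profiles = {"g1": [1.0, 2.0, 3.0], "g2": [4.0, 5.0, 6.0]}
random_modules = shuffle_module_assignments(modules, seed=1)
random_profiles = shuffle_expression_levels(profiles, seed=2)
```

A network rebuilt from the permuted profiles would then go through the same module detection and enrichment steps as the real one, giving a null comparison.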

Integration of GWAS and CD4+ gene expression 

We defined a GWAS locus (P value for association 
< 1 x 10^-6) as tagging a coexpression module if the 
module contained a gene within 250 kb of the locus. 
Coexpression modules tagged by GWAS loci were 
identified as candidate allergic rhinitis-associated 
modules and then assessed for enrichment of eSNPs 
that were also nominally associated with allergic 
rhinitis. A SNP was considered an eSNP if the SNP was 
located within 1 megabase of the corresponding gene 
and the association between genotype and gene 
expression was significant at a 10% false discovery 
rate (FDR) (P value < 1 x 10^-4). For modules tagged 
by at least 1 GWAS locus, we used Fisher's exact test 
to assess whether a module was enriched for eSNPs 
that were also nominally associated (P value < 0.01) 
with allergic rhinitis. The composition of modules was 
then assessed by pathway analysis using defined gene 
ontologies (GO) via the DAVID analysis tool [64,65]. 
Overrepresentation of canonical pathways and 
biological processes in modules was measured via 
Fisher's exact test. P values from this test were 
FDR-adjusted given the number of modules and 
functional categories tested. Networks were visualized 
using the Cytoscape network visualization tool [66]. 
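
The tagging rule and the enrichment test described above can be sketched as follows. Several details are assumptions rather than statements from the text: distance is measured from the locus position to the nearest gene boundary, the Fisher's exact test is one-sided (over-representation only), and the data layout and function names are hypothetical.

```python
from math import comb


def tags_module(locus_chrom, locus_pos, module_genes, window=250_000):
    """Return True if any module gene lies within `window` bp of the locus.

    module_genes: iterable of (chrom, start, end) tuples (hypothetical layout).
    """
    for chrom, start, end in module_genes:
        if chrom != locus_chrom:
            continue
        # Distance is 0 when the locus falls inside the gene body.
        distance = max(start - locus_pos, locus_pos - end, 0)
        if distance <= window:
            return True
    return False


def fisher_exact_enrichment(hits_in_module, module_size, hits_overall, total):
    """One-sided Fisher's exact P value for over-representation.

    Computed as the upper tail of the hypergeometric distribution: the
    probability of seeing at least `hits_in_module` hits when drawing
    `module_size` items from `total`, of which `hits_overall` are hits.
    """
    max_hits = min(module_size, hits_overall)
    tail = sum(
        comb(hits_overall, k) * comb(total - hits_overall, module_size - k)
        for k in range(hits_in_module, max_hits + 1)
    )
    return tail / comb(total, module_size)


# Hypothetical locus, gene coordinates, and counts (illustrative only).
example_genes = [("1", 1_200_000, 1_210_000), ("2", 500_000, 520_000)]
is_tagged = tags_module("1", 1_000_000, example_genes)
p_enrich = fisher_exact_enrichment(4, 5, 5, 10)
```

In the toy call, the locus sits 200 kb upstream of the first gene and therefore tags the module; the enrichment P value is 26/252, the chance of drawing at least 4 of the 5 hits in a sample of 5 from 10 items.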